Citation: CPT: Pharmacometrics & Systems Pharmacology (2013) 2, e76; doi: 10.1038/psp.2013.52 
© 2013 ASCPT All rights reserved 2163-8306/12 



www.nature.com/psp 

ORIGINAL ARTICLE 

Medication-Wide Association Studies 

PB Ryan 1 , D Madigan 2 , PE Stang 1 , MJ Schuemie 1,3 and G Hripcsak 4 

Undiscovered side effects of drugs can have a profound effect on the health of the nation, and electronic health-care databases 
offer opportunities to speed up the discovery of these side effects. We applied a "medication-wide association study" approach 
that combined multivariate analysis with exploratory visualization to study four health outcomes of interest in an administrative 
claims database of 46 million patients and a clinical database of 11 million patients. The technique had good predictive value, 
but there was no threshold high enough to eliminate false-positive findings. The visualization not only highlighted the class 
effects that strengthened the review of specific products but also underscored the challenges in confounding. These findings 
suggest that observational databases are useful for identifying potential associations that warrant further consideration but are 
unlikely to provide definitive evidence of causal effects. 

CPT: Pharmacometrics & Systems Pharmacology (2013) 2, e76; doi: 10.1038/psp.2013.52; published online 18 September 2013 



The increasing adoption of electronic health records (EHRs) 1 
and the availability of other data sources, such as administra- 
tive claims data 2 and spontaneous adverse drug event report- 
ing systems, 3 promise a new era of medical discovery. 4 One 
area that has shown concrete progress is pharmacovigilance. 5 
Adverse drug events represent a huge health and economic 
cost to the nation. 6-8 It is simply not possible to detect all pos- 
sible drug side effects in the drug-approval process because 
of small sample size, narrow study populations, and limited 
time course. Postmarket surveillance of drug safety — that 
is, pharmacovigilance — promises to detect important side 
effects as soon as possible to minimize the damage. 

Before regulatory approval, while a drug is in development, 
randomized clinical trials represent the primary sources of 
safety information. Such experiments are generally regarded 
as the highest level of evidence, leading to an unbiased esti- 
mate of the average treatment effect. 9 Unfortunately, most tri- 
als suffer from insufficient sample size and lack of applicability 
to reliably estimate the risk of other potential safety concerns 
for the target population. 10,11 As a result, new evidence about 
safety is required even after a medical product is approved. 

A number of techniques have been developed to infer drug 
side effects from large databases in the postapproval setting. 12 
Spontaneous adverse event reporting databases comprise 
voluntary reports of a suspected relationship between a medical 
product exposure and subsequent adverse effects. As a result, these 
spontaneous databases present challenges in analysis, 
because there is no defined population from which to base 
the denominator when estimating reporting rates. The reports 
reflect a nonrandom sample from the total patients exposed 
and the total patients who have experienced the adverse 
event, but neither total is reliably obtained. Disproportionality 
analysis methods for spontaneous adverse event reporting 
data were established as an approach to account for the 
lack of denominator by using the universe of all reports as a 
proxy to estimate the expected number of events that could be 
compared with the true observed count. Longitudinal observational 
health-care databases, such as administrative claims 
and EHRs, offer the opportunity to define a population over time, 
enabling the estimation of background rates of events and 
drug utilization patterns, which can then be used as denomi- 
nators for evaluating the strength of association between 
exposure and outcomes. However, retrospective observational 
database analyses suffer from a multitude of potential sources 
of bias due to the data capture process and health-care delivery 
system. For example, it is common that the indication for 
a drug may bias the estimated association if it is associated 
with an increased risk of the outcome itself. 13 Propensity score 
adjustment, 14 self-controlled designs, 12 and domain knowledge 
(e.g., indications) 15 are commonly used to reduce confound- 
ing; however, health records have unreliable timing, and indi- 
cations may be correlated so that a second indication may be 
confused with a side effect. Pharmacovigilance also presents 
the challenge of multiplicity, as there are >1,500 active ingredients 
in prescription medications and each requires monitoring 
for thousands of potential side effects; however, simultaneous 
evaluation of millions of statistical tests is likely to produce 
many false-positive findings due to chance alone. A number of 
techniques for addressing multiplicity, including false discovery 
rate analysis, 16 have been suggested. 
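The disproportionality idea described in this paragraph can be illustrated with a minimal sketch. It assumes the simplest observed-to-expected formulation, in which the expected count for a drug-event pair is derived from the marginal totals of all reports under independence; the function names and the counts are hypothetical and are not taken from the study.

```python
def expected_count(n_drug: int, n_event: int, n_total: int) -> float:
    """Expected number of drug-event reports if drug and event were
    independent, using the universe of all reports as the reference."""
    return n_drug * n_event / n_total


def observed_to_expected(n_drug_event: int, n_drug: int,
                         n_event: int, n_total: int) -> float:
    """Ratio of the observed drug-event report count to its expected count."""
    return n_drug_event / expected_count(n_drug, n_event, n_total)


# Hypothetical counts for illustration only.
ratio = observed_to_expected(
    n_drug_event=30,   # reports mentioning both the drug and the event
    n_drug=1_000,      # reports mentioning the drug
    n_event=2_000,     # reports mentioning the event
    n_total=200_000,   # all reports in the database
)
print(ratio)
```

A ratio well above 1 marks a pair that is reported more often than the marginals would suggest, but because no exposed population is defined, it is a reporting ratio rather than a risk estimate.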

The consequence of dependencies, confounding, and other 
"noise" is an unacceptably high false-positive rate. The state 
of the art for pharmacovigilance on the Observational Medi- 
cal Outcomes Partnership (OMOP) 17 databases, which cover 
140 million lives, produces areas under the receiver operating 
characteristic curve of almost 0.8. 18 Even with a high threshold 
(relative risk > 2), which led to an average sensitivity of 0.28, 
the average specificity was only 0.87 and the average positive-predictive 
value reached only 0.51. Therefore, the discovery 
of an adverse event association through mining even very 
large databases cannot be used to directly infer actual risks. 
At best, the method generates a smaller pool of hypotheses 



1 Janssen Research and Development, Titusville, New Jersey, USA; 2 Department of Statistics, Columbia University, New York, New York, USA; 3 Department of Medical 
Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands; 4 Department of Biomedical Informatics, Columbia University, New York, New York, USA. 
Correspondence: G Hripcsak (hripcsak@columbia.edu) 

Received 29 April 2013; accepted 9 August 2013; advance online publication 18 September 2013. doi:10.1038/psp.2013.52 






that warrant further study. The volume of hypotheses when 
applied to all potential outcomes across the entire formulary 
of drugs, however, is likely to be in the hundreds or thousands. 
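To see why a specificity of 0.87 still yields a modest positive-predictive value, the relationship between sensitivity, specificity, and the proportion of true effects among the tested pairs can be sketched as follows. The sensitivity and specificity are the averages quoted above; the prevalence values are assumptions chosen for illustration, and the calculation treats the averages as if they came from a single test, which is a simplification.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Bayes' rule for the probability that a flagged pair is a true effect."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Assumed prevalences: the share of tested drug-outcome pairs that are real.
for prevalence in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(0.28, 0.87, prevalence)
    print(f"prevalence={prevalence:.2f} -> PPV={ppv:.3f}")
```

The sketch shows that the positive-predictive value falls as true effects become rarer among the pairs tested, which is the situation expected when screening the entire formulary.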

High-visibility drug market withdrawals, such as that of 
rofecoxib, 19 have led investigators to assess when its side 
effects could have been discovered according to various 
databases. 20-22 Retrospective assessments of the early 
appearance of a signal are common in the literature but are 
misleading as the investigation focuses on a single "known" 
signal rather than establishing the context of looking for these 
signals across an entire set of exposures and outcomes: these 
studies fail to account for the potential false-positive rate that 
would occur if the same method were similarly applied to all 
other drugs for the same outcome. Schuemie et al. have shown 
substantial risk of both false-positive and false-negative 
results when establishing decision thresholds near the effect 
size where rofecoxib signaled. 23 Removing all drugs from the 
market whose relative risk confidence interval exceeds one or 
some other threshold is likely to cause more harm than good. 

At present, the only possible approach is to manually 
review and prioritize the generated lists of hypotheses. Experts' 
domain knowledge of pharmacology, physiology, and health care 
may help in addressing issues such as confounding between 
indications and side effects. In the past, we have used bar plots 12 
and forest plots 23 to better visualize and interpret pharmaco- 
vigilance results, but those approaches fall short because they 
convey no domain knowledge (indication and structure). 

Genome-wide association studies identify relevant genetic 
changes associated with disease states from among the thou- 
sands to millions of potential sites. The typical visualization of 
these associations shows the statistical significance (-log P 
value) of the target sites compared with all others, where the 
sites are organized by their placement in the genome (see 
for example, Ikram et al.). 24 The organization places sites 
within genes near each other and places sites that are geneti- 
cally linked near each other. The visualization approach was 
adopted for clinical associations in the so-called phenome- 
wide association studies. 25 These are an inverse of a genome- 
wide study, in which a single genetic locus is compared with 
all possible phenotypes. The plot is organized by clinical system, 
often using the International Classification of Diseases, 9th 
Revision, Clinical Modification 26 for organizing the pheno- 
types so that those affecting similar systems are colocated. 

Using an approach based on genome- and phenome-wide 
association studies, we propose a "medication-wide associa- 
tion study" (MWAS), in which each side effect is compared 
with all drugs available for comparison. We organize the drugs 
by the Anatomical Therapeutic Chemical Classification Sys- 
tem, 27 which groups drugs both by the organ system on which 
they act and by their therapeutic characteristics and chemical 
structure. We applied a self-controlled case series (SCCS) 
analysis to 6 years' data from two observational health-care 
databases — the Truven MarketScan Commercial Claims and 
Encounters (CCAE) administrative claims database with 46.5 
million lives, and the GE Centricity EHR database with 11.2 
million lives 18 — and four clinically important side effects: acute 
myocardial infarction, acute liver failure, acute renal failure, 
and upper gastrointestinal ulcer. We plotted drugs for which 
we had ground truth of either known side effects or known lack 
of side effects according to appropriately powered studies. 
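The layout of an MWAS plot can be sketched in a few lines: drugs are ordered by their ATC code so that related products are adjacent, and each drug is placed at a height of -log10(P). The data structure, function name, drug labels, and P values below are hypothetical and serve only to illustrate the arrangement; they are not the study's data or code.

```python
import math
from typing import NamedTuple


class DrugResult(NamedTuple):
    drug: str               # placeholder drug label (hypothetical)
    atc_code: str           # ATC-style code; first letter is the anatomical group
    p_value: float          # must be > 0 so the logarithm is defined
    positive_control: bool  # True if the drug is a known-positive test case


def mwas_points(results):
    """Return (x, drug, atc_group, -log10(p), positive_control) tuples,
    ordered by ATC code so that related drugs sit next to each other."""
    ordered = sorted(results, key=lambda r: r.atc_code)
    return [
        (x, r.drug, r.atc_code[0], -math.log10(r.p_value), r.positive_control)
        for x, r in enumerate(ordered)
    ]


# Made-up values for illustration only; not taken from the study.
example_results = [
    DrugResult("drug_c", "N06AB06", 0.20, False),
    DrugResult("drug_a", "M01AE01", 0.001, True),
    DrugResult("drug_b", "M01AE02", 0.03, True),
]

for point in mwas_points(example_results):
    print(point)
```

The resulting tuples can be handed to any plotting library, with the ATC group driving the color and the control status driving the marker shape.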



RESULTS 

Figure 1 shows the four side-effect plots for the Truven Mar- 
ketScan CCAE database. For myocardial infarction, a number 
of true associations (star markers) are above the threshold of 
P < 0.05, but there appears to be a class-specific tendency to 
display (e.g., anti-inflammatory) or not display (e.g., psychoa- 
naleptics) an effect. Negative controls (circle markers) show 
P values almost as extreme as the true associations. For 
acute liver failure, the results are similar, with some classes 
with known effects displaying it and others not, and with a 
false-positive as high as the highest true-positives. Acute 
renal failure is similar. Upper gastrointestinal ulcer performs 
better with few notable false-positives. 

Figure 2 displays the P-value plots across the negative 
controls for each of the four outcomes. In all the cases, the 
proportion of tests with P < 0.05 is substantially higher than 
the 5% expected, indicating that these observational analy- 
ses do not satisfy the standard assumptions of independent 
and unbiased estimators. 
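One rough way to quantify "substantially higher than the 5% expected" is to compare the observed share of negative controls with P < 0.05 against a binomial reference. The sketch below assumes independent tests, which the text itself questions, so it is only a yardstick; the P values are invented for illustration.

```python
from math import comb


def fraction_significant(p_values, alpha=0.05):
    """Share of tests whose P value falls below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


def binomial_tail(k, n, p=0.05):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least k
    nominally significant results among n independent null tests."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Invented negative-control P values for illustration only.
negative_control_p = [0.001, 0.004, 0.02, 0.03, 0.04, 0.045, 0.08, 0.12,
                      0.20, 0.26, 0.33, 0.41, 0.48, 0.55, 0.63, 0.70,
                      0.78, 0.85, 0.91, 0.97]

n = len(negative_control_p)
k = sum(p < 0.05 for p in negative_control_p)
print(f"{k}/{n} significant ({fraction_significant(negative_control_p):.0%})")
print(f"chance of >= {k} under an unbiased, independent null: {binomial_tail(k, n):.2e}")
```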

Figure 3 compares the results for CCAE and the GE Cen- 
tricity database. For each drug, a line connects the results for 
the two databases, with the larger marker representing the 
CCAE database. In general, the CCAE P values are lower in 
value and therefore higher on the MWAS plots, likely because 
the database has a larger sample size and more complete 
data capture of health service utilization. The combination of 
the two databases does not appear, however, to help distin- 
guish between positive and negative controls. 



DISCUSSION 

Observational health-care databases are commonly used 
for evaluating specific hypotheses about potential drug 
safety issues, but only recently has the research community 
sought to systematically explore these data to proactively 
identify safety signals. In 2007, the US Congress passed 
the Food and Drug Administration Amendment Act, which 
required the Food and Drug Administration to establish a 
"postmarket risk identification and analysis system" with 
access to >100 million lives of electronic health-care data. 28 
In response, the Food and Drug Administration established 
the Sentinel Initiative, which has made progress toward 
developing a national data infrastructure, but has not yet 
conducted medication-wide analyses to identify potential 
safety concerns. 29 Our work illustrates a proof-of-concept 
approach for signal generation that can enable standardized 
surveillance of specific health outcomes of interest across 
all medical products. 

Our MWAS visualizations demonstrate both the oppor- 
tunity and challenge of pharmacovigilance in these large 
health-care databases. Most of the signals identified in these 
analyses were positive controls that we would hope a system 
would detect, and the majority of negative controls failed to 
yield statistically significant false-positive associations. This 
performance reflects the previously documented predictive 
value of up to 0.8. 18 

Nevertheless, for each outcome, we observed a large 
number of drugs known not to have side effects that did have 



significant statistical associations. Conversely, many drugs 
known to have effects do not signal despite the large size 
of the database. All four plots in Figure 1 contain false- 
positives (circles) above the Bonferroni-corrected threshold 
of ~0.0005, and three of the four have false-positives at the 
most significant P values. Therefore, the false-positives are 
not due to testing multiple hypotheses and we must consider 
sources of error such as confounding. For example, the very 
strong signal for hydrochlorothiazide causing acute renal 
failure may be due to its common coprescription in patients 
with renal impairment. The self-controlled design used in 
this analysis is only one of several alternative approaches 
that can be considered. While the SCCS explicitly addresses 
time-invariant confounding factors, such as gender, race, and 
genetics, it does not control for time-varying factors other 
than concomitant medication exposure. Other study design 
approaches include a new user cohort design, which uses 
an active comparator as a referent and estimates event rates 
during the time following initiation of treatment, and the case- 
control design, which compares exposure rates during the 
time before outcome incidence with exposure 
rates among matched patients who did not experience the 
outcome. We present the results from the SCCS because 
this design has been demonstrated in OMOP's experiments 
to have higher predictive accuracy and lower bias than these 
alternative approaches. 18 Future work should be considered 
to determine how best to combine results across multiple 
analyses to improve our understanding of the effects of medi- 
cal products. 
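The Bonferroni-corrected threshold mentioned in this paragraph follows from dividing the nominal significance level by the number of tests. The sketch below assumes roughly 100 drugs tested per outcome, a figure inferred from a threshold of about 0.0005 rather than stated in the text; the drug labels and P values are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


def passes_bonferroni(p_values: dict, alpha: float = 0.05,
                      n_tests: int = 100) -> dict:
    """Flag each drug whose P value is below the corrected threshold.
    n_tests=100 is an assumption, not a number reported in the article."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return {drug: p < threshold for drug, p in p_values.items()}


# Hypothetical P values for illustration only.
flags = passes_bonferroni({"drug_a": 1e-6, "drug_b": 0.002, "drug_c": 0.04})
print(bonferroni_threshold(0.05, 100))
print(flags)
```

A negative control that is still flagged after this correction cannot be attributed to multiple testing alone, which is the argument the paragraph makes for looking at confounding instead.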

When we grouped drugs by the organ system of their indications 
for each of the four side effects (drugs grouped by color in 
Figure 1), we found a tendency of the drugs to act similarly 
within groups. We found 28 groups where all drugs in the 
organ class were negative and no association was found 
and 5 in which there were drugs with known side effects and 
an association was found in more than half. Thus, 33 of 59 
groups were handled well by the algorithm. In some cases, 
such as the positive effects of nonsteroidal anti-inflammatory 
drugs and acute myocardial infarction, the consistency of the 
findings supports the observation of a potential effect. There 
were 15 groups in which most or all of the known drugs with 
true side effects were missed, 2 groups in which a significant 
proportion of the drugs known not to have a side effect were 
found to have an association, 7 groups with a single spurious 
false-positive association, and 2 groups with a combination 
of a spurious association and incomplete or nearly complete 
identification of true side effects. For example, despite the 
known increased risk of acute liver injury after exposure to 
antivirals, the consistent lack of observed association could 
falsely lead to a conclusion that there is no effect. The ten- 
dency of drugs to act similarly within groups probably reflects 
biases due to the health-care process, because in most 
cases, the drugs within a group are not structurally similar. 
Despite the presence of these patterns, no single pattern 
appeared to reliably identify a drug as a true- or false-positive. 



For example, a single association within a group could be 
spurious or true, and a preponderance of associations within 
a group could represent accurate identification, a run of false- 
positives, or a combination. 
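The group counts reported above can be tallied to confirm that they add up to the 59 groups. The category labels in this sketch are paraphrases introduced for illustration; the counts are those given in the text.

```python
# Counts as reported in the text; the labels are paraphrased.
group_counts = {
    "all negative, no association found": 28,
    "known effects, association in more than half": 5,
    "most or all true effects missed": 15,
    "many negatives falsely associated": 2,
    "single spurious false-positive": 7,
    "spurious association plus incomplete detection": 2,
}

handled_well = (group_counts["all negative, no association found"]
                + group_counts["known effects, association in more than half"])
total_groups = sum(group_counts.values())

print(f"{handled_well} of {total_groups} groups handled well "
      f"({handled_well / total_groups:.0%})")
```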

Three of the graphs are notable for a lack of obvious con- 
founding by indication. Drugs with an indication that was 
related to the side effects — cardiovascular for myocardial 
infarction, urologic for renal failure, and alimentary tract for 
ulcer — did not produce false-positive associations, so the self- 
controlled study appeared to work in these cases. For acute 
liver failure, however, the false-positive findings observed for 
alimentary tract drugs may be due in some way to the effects 
or treatment of liver failure. 

One potential approach to addressing imperfect data is to 
combine evidence from disparate sources. Figure 3 shows 
two very different databases, derived from claims data and 
EHR data. Combining the two does not appear to help dis- 
criminate true signals from false ones; similar results were 
found for the other three side effects. We performed addi- 
tional experiments with two additional databases and found 
that multiple approaches to synthesize evidence across data- 
bases failed to improve discrimination. These results sug- 
gest that different health-care databases may exhibit similar 
biases, such that pharmacovigilance activities may require 
information sources beyond observational data to support the 
evaluation of safety signals. 

A P-value plot can be a useful test when each test can be 
considered as independent and unbiased. 30 One can determine 
whether the set of significance tests is consistent 
with the assumption of unbiased, independent tests by assessing 
whether the curve of P values deviates from the 45° line. 
In the context of observational studies, we expect that results 
may be biased, and studies of the same outcome are likely 
correlated insofar as the sources of bias for a given outcome 
may be consistent across multiple drugs. This can be seen 
from the P-value plots of the negative controls (Figure 2), 
which show a disproportionate number of significant findings. 
For this reason, we argue that statistical significance using traditional 
P values or multiplicity-adjusted thresholds is insufficient, 
and that instead rank-ordering effects based on P value, 
as we display in the MWAS plots, may be a more principled 
approach to triaging potential drug safety concerns. 
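The P-value plot discussed above can be sketched as an empirical distribution of sorted P values compared with the diagonal. This is a minimal illustration of the general technique under the assumption that each point pairs the fraction of tests at or below a P value with that P value; the function names and inputs are hypothetical.

```python
def p_value_plot(p_values):
    """Return (fraction_at_or_below, p_value) points for sorted P values.
    Independent, unbiased null tests should track the diagonal."""
    ordered = sorted(p_values)
    n = len(ordered)
    return [((i + 1) / n, p) for i, p in enumerate(ordered)]


def max_deviation_from_diagonal(p_values):
    """Largest gap between the plotted curve and the 45-degree line."""
    return max(abs(fraction - p) for fraction, p in p_value_plot(p_values))


# Invented negative-control P values, skewed toward small values.
skewed = [0.001, 0.01, 0.02, 0.04, 0.10, 0.30, 0.60, 0.90]
print(p_value_plot(skewed)[:3])
print(max_deviation_from_diagonal(skewed))
```

A curve that sags well below the diagonal at small P values, as in the invented example, corresponds to the excess of significant negative controls described for Figure 2.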

The MWAS approach of systematic exploration of struc- 
tured observational health-care claims and EHR databases is 
only one tool to complement other recent innovations toward 
improving the evidence base about the safety profile of medical 
products. LePendu et al. have demonstrated how natural 
language processing of free text in medical records can be 
used to draw inferences about potential drug-side effect relationships. 31 
Harpaz et al. recently measured the performance 
of new algorithms for data mining in spontaneous adverse 
event reporting data and demonstrated that disparate data 
may have differential performance across health outcomes 
of interest. 32 Tatonetti et al. 33,34 and Duke et al. 35 have successfully 
demonstrated the potential to go beyond studying 



Figure 1 Medication-wide association study (MWAS) analyses in Commercial Claims and Encounters (CCAE) database for (a) acute 
myocardial infarction, (b) acute liver injury, (c) acute kidney injury, and (d) upper gastrointestinal bleeding. Y-axis displays P values on the 
negative log scale. X-axis displays all the drugs studied for a given outcome, grouped by the Anatomical Therapeutic Chemical classification 
system. OMOP, Observational Medical Outcomes Partnership. 






Figure 2 P-value plots for negative controls, trellised by outcome. Y-axis lists the P value for each drug-outcome pair and X-axis shows the 
percentile of the negative control drugs which have a P value at or below that P value. The black dashed line indicates the 45° line, which should 
approximate the P-value curves if the statistical tests were independent and unbiased. CCAE, Commercial Claims and Encounters; OMOP, 
Observational Medical Outcomes Partnership. 



the main effects to also explore drug-drug interactions in the 
same data, and to integrate the results of observational anal- 
ysis with other information sources, such as the published lit- 
erature and chemical structure ontologies. 

MWASs provide a structured approach for evaluating 
potential drug safety concerns across all products in a way 
that provides the necessary context for interpreting any one 
drug-safety question of interest. While these illustrations 
focus on a defined set of negative and positive control test 
cases for methodological purposes, we believe this graphical 
representation provides a consistent framework that can be 
applied to all drugs and outcomes as a means to assess the 
drug-outcome pairs for which we are still uncertain about the 
true extent of the potential relationship. That context involves 
understanding how unique a particular observation is by see- 
ing how many other drugs yielded similar effects, and also 
involves seeing how consistent findings are with medical 
products that share similar characteristics. Further context 
is provided by evaluating an association through replication 
within two or more data sources. In this regard, the MWAS 
visualization using an SCCS analysis across multiple data- 
bases provides a framework that embodies several of the 
elements required for evaluating a potential causal effect, 
including strength of association, consistency, temporality, 
specificity, and coherence. 36 Observational health-care data 
alone may not be sufficient to provide definitive evidence of 
any purported effect; however, systematic analysis of these 
data offers tremendous potential in providing credible evi- 
dence for advancing our understanding of the effects of medi- 
cal products across large populations and a wide variety of 
products. 

METHODS 

We conducted this analysis in two observational health-care 
databases, the Truven MarketScan CCAE administrative 
claims database and the GE Centricity EHR database. 18 
CCAE represents a privately insured population and captures 
inpatient and outpatient medical claims and pharmacy claims 
of multiple insurance plans. The database used in this analy- 
sis contained 46.5 million lives with >97.6 million patient- 
years of observation from 2003 to 2009. We defined periods 
of drug exposure based on pharmacy dispensing records and 
procedural administrations. The GE MQIC (Medical Quality 
Improvement Consortium) represents the group of provid- 
ers who use the GE Centricity Electronic Medical Record 
and who contribute their data for secondary analytic use. 
The GE MQIC database reflects events in usual care, includ- 
ing patient problem lists, prescribing patterns and over-the- 
counter use of medications, and other clinical observations as 
experienced in the ambulatory care setting. GE contains 11.2 
million lives with data from 1996 to 2008. Drug exposures 
were inferred from medication history and prescriptions writ- 
ten. For both databases, we applied standardized algorithms 
to define acute myocardial infarction, acute liver failure, acute 
renal failure, and upper gastrointestinal bleeding based on 
diagnosis codes on inpatient and outpatient medical claims. 37 

For each outcome, we identified a set of negative and 
positive controls. Ground truth was established based on sys- 
tematic literature review and natural language processing of 
structured product labeling, with positive controls identified as 
drugs with Boxed Warnings or Precautions that are supported 
by published evidence with no conflicting published studies, 
and negative controls defined as drugs with no evidence sug- 
gesting an association in either labeling or literature. 38 Drugs 
with inconsistent evidence were excluded. The MWAS plots 
shown in Figure 1 display the full set of negative and posi- 
tive controls for each outcome that were tested as part of the 
OMOP experiment. The specific number of drugs varies by 
outcome; 118 drugs were studied for acute liver injury, 102 
for acute myocardial infarction, 88 for acute renal failure, and 
91 for gastrointestinal bleeding. Analyses were performed on 
RxNorm ingredient concepts. RxNorm concepts were clas- 
sified using the Anatomical Therapeutic Chemical hierarchy 
only for presentation purposes, but this classification does 
not affect the effect estimation procedure. The RxNorm-to- 
Anatomical Therapeutic Chemical mapping is part of the 
OMOP vocabulary model and was created by and licensed 
from FirstDataBank. The entire OMOP vocabulary is publicly 
available online (http://omop.org/CDMvocabV4). 

For each drug-outcome pair, we performed an SCCS 
analysis, 39,40 which compares the event rate during time-at-risk 
with the rate during unexposed time among patients 
who had at least one exposure and one outcome record. We 
defined time-at-risk as all time after the start of exposure, including 
the index date when treatment was initiated and continuing 
through the end of the patient's observation period. All time 
before the start of drug exposure is considered the unexposed 
period. We included all occurrences of the outcome. We 
applied a regularized implementation of the SCCS model, 41 
with the regularization parameter determined by cross-validation, 
and we performed multivariate adjustment for time-varying 
concomitant medications. The multivariate SCCS implementation 
uses all RxNorm ingredients as potential covariates in the 
model. Only those RxNorm ingredients which are observed 
in patients with exposure to the target drug and an occur- 
rence of the event are actually fit within each model. Each 
analysis produced an incidence rate ratio, 95% confidence 
interval, and P value. The MWAS plot displays the P value on 
the negative log scale across all drugs for the same outcome. 
Drugs are grouped according to the Anatomical Therapeu- 
tic Chemical classification system. The source code of the 
SCCS implementation used to produce this analysis is pub- 
licly available online (http://omop.org/MethodsLibrary). The 
entire result set of all methods executions across a network 
of observational databases for all drug-outcome test cases is 
also publicly available online (http://omop.org/Research). 
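
As a rough illustration of the comparison described above, the following sketch estimates an incidence rate ratio for a single drug from per-patient unexposed and exposed time and event counts. It is a deliberately simplified, unregularized, single-drug version under stated assumptions, not the OMOP implementation: it omits the regularization, the cross-validation, and the adjustment for concomitant medications. The function name `sccs_irr` and the example inputs are hypothetical.

```python
import math

def sccs_irr(patients, lo=-10.0, hi=10.0, tol=1e-10):
    """Conditional maximum-likelihood incidence rate ratio for one drug.

    patients: iterable of (unexposed_days, exposed_days,
                           unexposed_events, exposed_events),
    one tuple per patient with at least one exposure and one outcome.
    Assumption: a single exposed period per patient and no other covariates.
    """
    patients = list(patients)

    def score(beta):
        # Derivative of the conditional log-likelihood with respect to beta,
        # where the rate ratio is exp(beta): observed minus expected
        # number of events in the exposed period, summed over patients.
        rho = math.exp(beta)
        total = 0.0
        for t0, t1, n0, n1 in patients:
            expected_exposed = (n0 + n1) * rho * t1 / (t0 + rho * t1)
            total += n1 - expected_exposed
        return total

    # The score decreases monotonically in beta, so bisection finds its root.
    if score(lo) <= 0.0 or score(hi) >= 0.0:
        raise ValueError("estimate not bracketed; events are needed in both periods")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

# Hypothetical patients, invented for illustration only
cases = [(200.0, 100.0, 1, 2), (200.0, 100.0, 1, 0), (200.0, 100.0, 0, 1)]
irr = sccs_irr(cases)
```

Because every invented patient has the same unexposed and exposed time, the estimate reduces to the ratio of pooled rates, (3/100)/(2/200) = 3, which provides a simple check on the solver.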

Only fully deidentified data sets were used in the study and 
only aggregate-level data are reported, so review by an 
Institutional Review Board was not required. 










Figure 3 Comparison between Commercial Claims and Encounters (CCAE) and GE databases of medication-wide association study (MWAS) 
analyses for acute myocardial infarction. Y-axis displays P values on the negative log scale. X-axis displays all the drugs studied for a given 
outcome, grouped by the Anatomical Therapeutic Chemical classification system. OMOP, Observational Medical Outcomes Partnership. 
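
The layout described in this caption can be sketched as a small data-preparation step. The sketch assumes one record per drug with hypothetical field names `drug`, `atc_class`, and `p_value`; the helper name `mwas_points`, the drug labels, and the P values are invented for illustration and are not study results.

```python
import math

def mwas_points(results):
    """Return (x_position, drug, atc_class, neg_log10_p) tuples, ordered so
    that drugs in the same ATC class sit next to each other on the x-axis.
    Assumes every p_value is strictly greater than zero."""
    ordered = sorted(results, key=lambda r: (r["atc_class"], r["drug"]))
    points = []
    for x, record in enumerate(ordered):
        neg_log10_p = -math.log10(record["p_value"])
        points.append((x, record["drug"], record["atc_class"], neg_log10_p))
    return points

# Hypothetical records, invented for illustration only
results = [
    {"drug": "drug_c", "atc_class": "N02", "p_value": 0.5},
    {"drug": "drug_a", "atc_class": "M01", "p_value": 0.001},
    {"drug": "drug_b", "atc_class": "M01", "p_value": 0.01},
]
points = mwas_points(results)
```

Sorting by class before assigning x positions is what lets class effects appear as clusters of neighboring points; plotting the same ordering for each database makes the two panels directly comparable.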



Acknowledgments. G.H. was funded by a grant from the National 
Library of Medicine, "Discovering and applying knowledge 
in clinical databases" (R01 LM006910). P.R., M.S., P.S., 
and D.M. are research investigators for the Observational 
Medical Outcomes Partnership (OMOP). P.R., M.S., and P.S. 
are employees of Janssen Research and Development and 
do not receive funding for their participation in OMOP. D.M. 
receives funding through the Foundation of the National 
Institutes of Health. 

Author contributions. P.R., D.M., P.S., M.S., and G.H. wrote 
the manuscript. P.R., D.M., P.S., M.S., and G.H. designed the 
research. P.R., D.M., P.S., M.S., and G.H. performed the 
research. P.R., M.S., and G.H. analyzed the data. P.R., D.M., 
P.S., and M.S. contributed new reagents/analytical tools. 

Conflict of interest. The authors declared no conflicts of 
interest. 



Study Highlights 

WHAT IS THE CURRENT KNOWLEDGE ON THE 
TOPIC? 

✓ Undiscovered drug side effects can have a profound 
effect on the health of the nation, and 
electronic health-care databases offer opportunities 
to speed up the discovery of these effects. 

WHAT QUESTION DID THIS STUDY ADDRESS? 

✓ How can we better visualize and interpret the 
results of large-scale association studies of 
drug side effects using claims and clinical 
databases? 



WHAT THIS STUDY ADDS TO OUR KNOWLEDGE? 

✓ We created a "medication-wide association 
study", which combined statistical association 
with hierarchical information about the structure 
and function of drugs. The visualization highlighted 
class effects, which not only strengthened 
the review of specific products but also 
underscored the challenges in confounding. 

HOW THIS MIGHT CHANGE CLINICAL 
PHARMACOLOGY AND THERAPEUTICS? 

✓ These findings confirm that observational database 
analyses are useful for identifying potential 
associations that warrant further consideration 
but are unlikely to provide definitive evidence of 
causal effects. 



1. Blumenthal, D. & Tavenner, M. The "meaningful use" regulation for electronic health 
records. N. Engl. J. Med. 363, 501-504 (2010). 

2. Access to CMS Data & Application. 
<http://www.cms.gov/Research-Statistics-Data-and-Systems/CMS-Information-Technology/AccesstoDataApplication/index.html> 
Accessed 20 January 2013. 

3. US Food and Drug Administration (FDA). Adverse Event Reporting System (AERS). 
<http://www.fda.gov/cder/aers> Accessed 20 January 2013. 

4. Friedman, C.P., Wong, A.K. & Blumenthal, D. Achieving a nationwide learning health 
system. Sci. Transl. Med. 2, 57cm29 (2010). 

5. World Health Organization. The importance of pharmacovigilance - safety monitoring 
of medicinal products. World Health Organization: Geneva. 
<http://apps.who.int/medicinedocs/en/d/Js4893e/> (2002) Accessed 20 January 2013. 

6. Lazarou, J., Pomeranz, B.H. & Corey, P.N. Incidence of adverse drug reactions in 
hospitalized patients: a meta-analysis of prospective studies. JAMA 279, 1200-1205 
(1998). 

7. Classen, D.C., Pestotnik, S.L., Evans, R.S., Lloyd, J.F. & Burke, J.P. Adverse drug events 
in hospitalized patients. Excess length of stay, extra costs, and attributable mortality. 
JAMA 277, 301-306 (1997). 

8. Ahmad, S.R. Adverse drug event monitoring at the Food and Drug Administration. J. Gen. 
Intern. Med. 18, 57-60 (2003). 

9. Atkins, D. et al.; GRADE Working Group. Grading quality of evidence and strength of 
recommendations. BMJ 328, 1490 (2004). 

10. Berlin, J.A., Glasser, S.C. & Ellenberg, S.S. Adverse event detection in drug development: 
recommendations and obligations beyond phase 3. Am. J. Public Health 98, 1366-1371 
(2008). 

11. Waller, P.C. & Evans, S.J. A model for the future conduct of pharmacovigilance. 
Pharmacoepidemiol. Drug Saf. 12, 17-29 (2003). 

12. Harpaz, R., DuMouchel, W., Shah, N.H., Madigan, D., Ryan, P. & Friedman, C. Novel 
data-mining methodologies for adverse drug event discovery and analysis. Clin. 
Pharmacol. Ther. 91, 1010-1021 (2012). 

13. Walker, A.M. Confounding by indication. Epidemiology 7, 335-336 (1996). 

14. Rosenbaum, P.R. & Rubin, D.B. The central role of the propensity score in observational 
studies for causal effects. Biometrika 70, 41-55 (1983). 

15. Wang, X., Hripcsak, G. & Friedman, C. Characterizing environmental and phenotypic 
associations using information theory and electronic health records. BMC Bioinformatics 
10 (suppl. 9), S13 (2009). 

16. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and 
powerful approach to multiple testing. J. R. Stat. Soc. Series B Stat. Methodol. 57, 
289-300 (1995). 

17. Stang, P.E. et al. Advancing the science for active surveillance: rationale and design 
for the Observational Medical Outcomes Partnership. Ann. Intern. Med. 153, 600-606 
(2010). 

18. Ryan, P.B., Madigan, D., Stang, P.E., Overhage, J.M., Racoosin, J.A. & Hartzema, A.G. 
Empirical assessment of methods for risk identification in healthcare data: results from 
the experiments of the Observational Medical Outcomes Partnership. Stat. Med. 31, 
4401-4415 (2012). 

19. Arellano, F.M. The withdrawal of rofecoxib. Pharmacoepidemiol. Drug Saf. 14, 213-217 
(2005). 

20. Lependu, P., Iyer, S.V., Fairon, C. & Shah, N.H. Annotation Analysis for Testing Drug 
Safety Signals using Unstructured Clinical Notes. J. Biomed. Semantics 3 (suppl. 1), S5 
(2012). 

21. Brownstein, J.S., Sordo, M., Kohane, I.S. & Mandl, K.D. The tell-tale heart: population- 
based surveillance reveals an association of rofecoxib and celecoxib with myocardial 
infarction. PLoS ONE 2, e840 (2007). 

22. Brown, J.S. et al. Early detection of adverse drug events within population-based health 
networks: application of sequential testing methods. Pharmacoepidemiol. Drug Saf. 16, 
1275-1284 (2007). 

23. Schuemie, M.J. et al. Using electronic health care records for drug safety signal detection: 
a comparative evaluation of statistical methods. Med. Care 50, 890-897 (2012). 

24. Ikram, M.A. et al. Genomewide association studies of stroke. N. Engl. J. Med. 360, 
1718-1728 (2009). 

25. Denny, J.C. et al. PheWAS: demonstrating the feasibility of a phenome-wide scan to 
discover gene-disease associations. Bioinformatics 26, 1205-1210 (2010). 

26. International Classification of Diseases, 9th Revision, Clinical Modification. 
<http://www.cdc.gov/nchs/icd/icd9cm.htm> Accessed 21 January 2013. 

27. WHO Collaborating Centre for Drug Statistics Methodology. Anatomical therapeutic 
chemical classification system: structure and principles. 
<http://www.whocc.no/atc/structure_and_principles/> Accessed 21 January 2013. 

28. Food and Drug Administration Amendments Act of 2007. Public Law 110-85, 121 STAT. 
823 (2007). 

29. Robb, M.A. et al. The US Food and Drug Administration's Sentinel Initiative: expanding 
the horizons of medical product safety. Pharmacoepidemiol. Drug Saf. 21 (suppl. 1), 9-11 
(2012). 




30. Schweder, T. & Spjøtvoll, E. Plots of P-values to evaluate many tests simultaneously. 
Biometrika 69, 493-502 (1982). 

31. LePendu, P. et al. Pharmacovigilance using clinical notes. Clin. Pharmacol. Ther. 93, 
547-555 (2013). 

32. Harpaz, R., DuMouchel, W., LePendu, P., Bauer-Mehren, A., Ryan, P. & Shah, N.H. 
Performance of pharmacovigilance signal-detection algorithms for the FDA adverse 
event reporting system. Clin. Pharmacol. Ther. 93, 539-546 (2013). 

33. Tatonetti, N.P., Ye, P.P., Daneshjou, R. & Altman, R.B. Data-driven prediction of drug 
effects and interactions. Sci. Transl. Med. 4, 125ra31 (2012). 

34. Tatonetti, N.P., Fernald, G.H. & Altman, R.B. A novel signal detection algorithm for 
identifying hidden drug-drug interactions in adverse event reports. J. Am. Med. Inform. 
Assoc. 19, 79-85 (2012). 

35. Duke, J.D. et al. Literature based drug interaction prediction with clinical assessment using 
electronic medical records: novel myopathy associated drug interactions. PLoS Comput. 
Biol. 8, e1002614 (2012). 

36. Hill, A.B. The environment and disease: association or causation? Proc. R. Soc. Med. 58, 
295-300 (1965). 

37. Observational Medical Outcomes Partnership, Health Outcomes of Interest Library. 
<http://omop.org/HOI> Accessed 24 July 2013. 

38. Tisdale, J. & Miller, D. In Drug-Induced Diseases: Prevention, Detection, and 
Management 2nd edn. (Bethesda, MD: American Society of Health-System Pharmacists, 
2010). 

39. Whitaker, H.J., Hocine, M.N. & Farrington, C.P. The methodology of self-controlled case 
series studies. Stat. Methods Med. Res. 18, 7-26 (2009). 

40. Whitaker, H.J., Farrington, C.P., Spiessens, B. & Musonda, P. Tutorial in biostatistics: 
the self-controlled case series method. Stat. Med. 25, 1768-1797 (2006). 

41. Madigan, D., Ryan, P., Simpson, S. & Zorych, I. Bayesian methods in pharmacovigilance. 
In Bayesian Statistics 9. (eds. Bernardo, J.M. et al.) (Oxford, UK: Oxford University Press, 
2011). 



CPT: Pharmacometrics & Systems Pharmacology is an 
open-access journal published by Nature Publishing 
Group. This work is licensed under a Creative Commons 
Attribution-NonCommercial-NoDerivative Works 3.0 License. 
To view a copy of this license, visit 
http://creativecommons.org/licenses/by-nc-nd/3.0/ 






